Adversarial Cross-Domain Action Recognition with Co-Attention
Similar resources
Enhancing Action Recognition by Cross-Domain Dictionary Learning
Our work is inspired by two facts about the human vision system. The first is that humans are able to learn tens of thousands of visual categories in their lifetime, which leads to the hypothesis that humans achieve this capability through accumulated information and knowledge. The second is that humans' visual impressions of the same action or the same object come from a wide range, e.g., a...
Learning to Discover Cross-Domain Relations with Generative Adversarial Networks
While humans easily recognize relations between data from different domains without any supervision, learning to automatically discover them is in general very challenging and needs many ground-truth pairs that illustrate the relations. To avoid costly pairing, we address the task of discovering cross-domain relations given unpaired data. We propose a method based on generative adversarial netw...
Action Recognition using Visual Attention
We propose a soft attention based model for the task of action recognition in videos. We use multi-layered Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) units which are deep both spatially and temporally. Our model learns to focus selectively on parts of the video frames and classifies videos after taking a few glimpses. The model essentially learns which parts in the fram...
HashGAN: Attention-aware Deep Adversarial Hashing for Cross Modal Retrieval
With the rapid growth of multi-modal data, hashing methods for cross-modal retrieval have received considerable attention. Deep-network-based cross-modal hashing methods are appealing as they can integrate feature learning and hash coding into end-to-end trainable frameworks. However, it is still challenging to find content similarities between different modalities of data due to the heterogenei...
Joint Network based Attention for Action Recognition
By extracting spatial and temporal characteristics in one network, two-stream ConvNets can achieve state-of-the-art performance in action recognition. However, such a framework typically suffers from the separate processing of spatial and temporal information between the two standalone streams and struggles to capture the long-term temporal dependence of an action. More importantly, it is in...
Journal
Journal title: Proceedings of the AAAI Conference on Artificial Intelligence
Year: 2020
ISSN: 2374-3468, 2159-5399
DOI: 10.1609/aaai.v34i07.6854